Comparison of traditional and DNA metabarcoding samples for monitoring tropical soil arthropods (Formicidae, Collembola and Isoptera)

The soil fauna of the tropics remains one of the least known components of the biosphere. Long-term monitoring of this fauna is hampered by the lack of taxonomic expertise and funding. These obstacles may potentially be lifted with DNA metabarcoding. To validate this approach, we studied the ants, springtails and termites of 100 paired soil samples from Barro Colorado Island, Panama. The fauna was extracted with Berlese-Tullgren funnels and then either sorted with traditional taxonomy and known, individual DNA barcodes (“traditional samples”) or processed with metabarcoding (“metabarcoding samples”). We detected 49 ant, 37 springtail and 34 termite species with 3.46 million reads of the COI gene, at a mean sequence length of 233 bp. Traditional identification yielded 80, 111 and 15 species of ants, springtails and termites, respectively; 98%, 37% and 100% of these species had a Barcode Index Number (BIN) allowing for direct comparison with metabarcoding. Ants were best surveyed through traditional methods, termites were better detected by metabarcoding, and springtails were equally well detected by both techniques. Species richness was underestimated, and faunal composition was different in metabarcoding samples, mostly because 37% of ant species were not detected. The prevalence of species in metabarcoding samples increased with their abundance in traditional samples, and seasonal shifts in species prevalence and faunal composition were similar between traditional and metabarcoding samples. Probable false positive and negative species records were reasonably low (13–18% of common species). We conclude that metabarcoding of samples extracted with Berlese-Tullgren funnels appear suitable for the long-term monitoring of termites and springtails in tropical rainforests. For ants, metabarcoding schemes should be complemented by additional samples of alates from Malaise or light traps.

www.nature.com/scientificreports/ known barcodes ("traditional samples") or processed with metabarcoding ("metabarcoding samples"). We focused on ants, springtails and termites because (1) they are dominant arthropod groups in tropical soils, and (2) we previously generated local barcode libraries for these groups. Our emphasis was also on common as opposed to rare species, as only the former can realistically be well monitored in tropical rainforests 44 . As a basis for the reliable biomonitoring of the tropical soil fauna, we ask: 1. Are the species richness and faunal composition of focal groups (ants, springtails and termites) of traditional samples similar to that of metabarcoding samples, at least when considering common species? 2. In this local study, can species prevalence in metabarcoding samples (i.e., the number of samples in which a species is present) be used as a proxy for species abundance in traditional samples, at least for common species? 3. Do seasonal shifts in species prevalence and composition between the dry and wet season detected with traditional samples match similar seasonal shifts for metabarcoding samples? 4. Is there any association between species biomass and DNA read frequencies 38 ? 5. Can we understand the prevalence of "false positive" or "false negative" records 39,40 based on species traits or characteristics? 6. Do patterns related to all the above questions differ between ants, springtails and termites?

Material and methods
Additional details for all method sections are provided in Appendix S1.
Study site and field sampling. Assemblages of ants, springtails and termites were surveyed on Barro Colorado Island (9.15° N, 79.85° W; 120-160 masl) in Panama. The 1542 ha Barro Colorado Island is covered with lowland tropical forest and was created around 1910, when the Chagres River was dammed to fill the Panama Canal. Samples were obtained from the 50 ha ForestGEO vegetation dynamics plot, which is described Panama 50ha vegetation plot 10  www.nature.com/scientificreports/ in Ref. 31 . We considered ten locations (500 m sections of trails inside or near the plot) that are used for long-term arthropod monitoring and described in Refs. 33,44 (Fig. 1, Table 1). Each location was divided into ten sub-locations ( Fig. 1; see statistical methods below and Appendix S2 for explanation of this approach). For each of the ten locations, we randomly selected five sub-locations. At each of these selected sub-locations, we took two paired samples 10 cm distant from each other. We obtained 50 paired soil samples in March 2017, during the dry season. We repeated this sampling protocol in December 2017, during the wet season, with different random samples and obtained 100 paired soil samples for the two seasons. We used a 20 × 20 cm frame (400 cm 2 ) to delineate one sample and scoop all litter and soil up to a depth of ca. 5 cm or slightly deeper into a container, calibrating each sample to a final volume of 2 L. We restricted our sampling to the first 5 cm of the litter/soil as tropical soils tend to be shallow, with most of the faunal diversity concentrated in the top few centimeters 45 . To avoid contamination between samples in the field, we used laboratory gloves and disinfected all tools and receptacles with commercial bleach (Clorox de Centroamérica; hypoclorite of sodium 3.5%, hydroxide of sodium 0.3%), after which sampling gear was rinsed with distilled water, to clean bleach residues.
Each pair of samples consisted of two categories: "traditional samples", from which the soil fauna was extracted and sorted manually according to morphology and molecular data (see below), and "metabarcoding samples" which were analyzed using DNA metabarcoding. Traditional samples were set 4-5 h after collecting in Berlese-Tullgren funnels 46 . Arthropods were extracted during the first 24 h at ambient external temperature (i.e., without heat), and subsequently during the next 48 h with additional heat provided by 25 W bulbs. Berlese-Tullgren extracts were washed, arthropods were separated manually from soil debris and stored in 95% ethanol in a freezer at -20 °C until further analysis.
Metabarcoding samples were extracted similarly with Berlese-Tullgren with the difference that equipment was thoroughly disinfected with bleach and rinsed with distilled water before and after each sample extraction. Metabarcoding samples were stored at − 20 °C until being analyzed at the Hajibabaei laboratory at the University of Guelph, Canada, which occurred approximately three months after collection. Geographic coordinates of the samples are indicated in Ref. 33 and Fig. 1 summarizes the workflow of samples.
Processing of traditional samples. Ants, springtails and termites were all identified via morphological and molecular data (individual-based DNA barcoding). For morphological identification, ants were pinned and assigned to species using the ant reference collection of the ForestGEO Arthropod Initiative at the Smithsonian Tropical Research Institute. Springtails were first cleared in 10% KOH, then in lactophenol, and mounted on microscopic slides in Hoyer's solution. The slides obtained were identified at the Laboratorio de Ecología y Sistemática de Microartrópodos at the Universidad Nacional Autónoma de México (UNAM) using UNAMs extensive reference collections of springtails and specialized literature. Termites were stored in ethanol and soldiers identified using the termite reference collection of the ForestGEO Arthropod Initiative.
The standard DNA barcode region of the gene cytochrome c oxidase subunit I (COI) was sequenced for a subset of the specimens collected. DNA barcoding using Sanger sequencing was conducted at the Centre for Biodiversity Genomics, University of Guelph, using methods described in Wilson 47 . When possible, we sequenced a maximum of five individuals per species. In total, we generated 324 new COI sequences, corresponding to 328 specimens of 171 ants, 114 springtails and 43 termites. These sequences were added to BCI projects BCIFO, BCICL and BCIIS of the Barcode of Life Data System (BOLD; http:// www. barco dingl ife. org/ index. php), which summed 2,892 sequences (Appendix S1, Supplementary Table S1). These molecular data were used to confirm identifications based on morphology. Each species was attributed a Barcode Index Number (BIN) according to BOLD, which can be used as a proxy taxonomic unit in absence of binomial identification 14 (e.g., "Ectatomma ACH3273 ruidum").  50 . These were selected as proven to be effective to optimize recovery of species and genus richness in tropical arthropods 22,50 . Reactions had a standard mix as indicated in Appendix S1, for a total of 25 μL per reaction. A negative control (i.e., reaction with 2 μL water instead of DNA) was included to ensure PCR reagents were free of contamination. Reactions underwent 35 cycles of 94 °C for 40 s, 46 °C for 60 s, 72 °C for 30 s using an Eppendorf Mastercycler ep gradient S thermocycler. PCR amplification was visually confirmed through gel electrophoresis using a 1.5% agarose gel. PCR products were purified following the Min-Elute PCR Purification kit (Qiagen; Toronto, Ontario, Canada) standard protocol, eluting with 15 μL molecular biology grade water. A second round of PCR with primers that included the Illumina adapter sequence was run under the same conditions, using the purified product from the first round of PCR as template. Two PCRs with 35 cycles are sufficient for providing consistent and robust results for many samples across different sample types. This protocol has been used routinely with consistent results for biodiversity, ecological and biomonitoring studies (e.g., Refs. 22,49,51 ). Additionally, the high number of PCR cycles in the second round was practical because amplification was inadequate at lower cycles, and some samples failed. Purified second round PCR product was sequenced on an Illumina MiSeq using the v3 MiSeq sequencing kit (300 bp × 2).
Bioinformatics. The bioinformatics of metabarcoding remains in flux 27 . A wide array of metabarcoding tools exists, and their choice (and relevant parameters) is essential for meaningful results 52 . Available tools include, for example, Mothur 53 , Qiime 54 or Obitools 55 . However, an important restriction that pertains to all these tools is the use of customized databases for taxonomy classification. We relied on the mBRAVE cloud-based platform ("Multiplex Barcode Research and Visualization Environment"; http:// www. mbrave. net/ 56 ) since it seamlessly integrates BOLD data as reference datasets, which for BCI represent the best curated COI sequence datasets available. mBRAVE employs a typical workflow based on mature bioinformatic methods and is comparable to other available bioinformatics tools. The sequence reads resulting from metabarcoding were uploaded into the project MBR-BCISOIL of mBRAVE. mBRAVE parameters were optimized for classifying sequences with BOLD datasets which have already been curated and the various sequence errors removed. Hence, the stringent read processing was a bit more relaxed. All mBRAVE parameters are indicated in Supplementary Table S1, but the workflow can be summarized as follows.
The paired ends of the two overlapping amplicons were pooled (i.e., treated as one fastq file without any merging) as opposed to merging, since (a) we were expecting very high matches to mBRAVE datasets (see below) with variation within a BIN being small; and (b) a sensitivity analysis, where we varied most mBRAVE parameters one by one to optimize the greatest number of BINs recovered in mBRAVE datasets, indicated an optimal recovery of BINs with pooled sequences (Supplementary Appendix S1 and Supplementary Table S2). Reads were trimmed at each end (25 bp) to remove primer and adapter sequence, then filtered for a minimum quality score of QV20 and for a minimum and maximum size of 100 bp and 500 bp, respectively (Appendix S1). Denoising, dereplication and removal of chimeric reads, were performed with VSEARCH 57 , integrated into the mBRAVE pipeline (Appendix S1). Erroneous sequences were detected by a divergence range from reference sequences, with parameters as indicated in Supplementary Table S1. Sequences were identified to BIN level by matching them to reference BINs based on BOLD library datasets tailored for this study, using VSEARCH. These datasets included sequences obtained specifically for this study (2571 BINs for BCI arthropods) and publicly available datasets within BOLD (Appendix S1, Supplementary Table S1). BINs assignments were realized at 3%, a commonly employed threshold 58 . Reads without a match to reference BOLD datasets were clustered into operational taxonomic units (OTU) based on (1) the minimum OTU size (n = 1 read); (2) OTU threshold (maximum distance inside a generated OTU, a conservative 2%); and (3) exclusion from the OTU threshold when sequencing error introduces spurious haplotypes (chimeras and/or sequence errors, 3%). The BOLD datasets that we used for analyses are currently the best ones available for ants, springtails and termites from BCI (Supplementary Appendix S1 and Supplementary Table S1). It is important to note that, although sequence merging is often recommended to remove potentially erroneous sequences, we opted to pool our reads without merging as this retrieved a higher number of BINs. This strategy is possible because (a) we count with an extensive barcode reference library which has been continuously updated and curated by co-authors of this manuscript and represents the most complete insect barcode reference library available for Barro Colorado Island; and (b) this represents a first step towards a standardized and automated arthropod monitoring protocol where our primary goal is to detect reads matching known species barcodes.
Depending on the data and aims of the metabarcoding study, this approach may or may not be relevant. www.nature.com/scientificreports/ ally, we only considered them if they had been sequenced and had a BIN, given that some species failed to be sequenced (see "Results"). Common (as opposed to rare) species were defined as species belonging to the first quartile of the species-rank prevalence distribution 59 , for species sorted in traditional samples. The 95 metabarcoding samples resulted in a total of 122 sequencing runs (i.e., single fastq files produced by the sequencer through demultiplexing or from a single sample) and hence sampling effort was not strictly identical among locations sampled. However, unequal (but similar) sampling effort among locations did not affect any of our questions, and small differences in sample size among locations were ignored. Despite harvesting traditional and metabarcoding samples only 10 cm apart, we can still expect discrepancies in species abundance and faunal comparison between these samples. This results from soil organisms, such as ants, to be very patchily distributed when studied at the 1 m 2 scale 60 . To reduce this effect, our analyses emphasized data in which samples were grouped per location, as to evaluate differences at a higher scale. This issue is further discussed in Appendix S2.

Statistical analyses.
Question1: similarity between traditional and metabarcoding samples. To compare species richness between sample types (traditional and metabarcoding) and taxa (prevalence data), we computed rarefaction curves and estimates of total species richness with the R package iNEXT 61 . To visualize the differences in faunal composition between traditional and metabarcoding samples, we generated differential heat trees using the Metacoder R package 62 . These trees help identifying the taxonomic hierarchy with terminal nodes representing either BINs or binomial species. With a prevalence matrix, we used the function 'compare_groups' of Metacoder and the resulting mean difference between treatments to plot the differential heat tree. The differences, however, were formally tested using a Procrustes rotation, which rotates a configuration (for example ordination results) to maximum similarity with another configuration. First, for each dataset (traditional and metabarcoding), we pooled samples by the 10 locations at our study site (Table 1; Fig. 1) and considered matrices of species × locations, filled with the number of samples in which the species was recorded for each location (prevalence; maximum possible of 10 samples for each location). We then performed a non-metric multidimensional scaling (NMDS) for each of the traditional and metabarcoding matrices with the function 'metaMDS' of the R package vegan, followed by a Procrustes rotation of the two ordinations with the function 'procrustes' of the same package 63 . The function 'protest' of vegan was used to tests for non-randomness ('significance') between the traditional and metabarcoding NMDS (999 permutations). We performed Procrustes rotations separately for ants, springtails and termite data, and either for all species surveyed or restricted to common species. Question 2: regressions between abundance and prevalence. We considered the relationships between the following variables, for each species: (a) total abundance vs. prevalence in traditional samples; (b) total abundance in traditional samples vs. prevalence in metabarcoding samples; and (c) prevalence in traditional samples vs. prevalence in metabarcoding samples. We tested these three relationships for the three taxonomic groups together, and then for each group separately, considering either all species or only common species. We first checked linear and non-linear relationships with CurveExpert Professional 64 , considering the highest coefficient of determination (R 2 ) and the lowest Akaike information criterion corrected (AICc) of all models. Given that linear relationships provided the best fit for all models tested, we fitted ordinary least squares regressions to all tested relationships. Question 3: seasonal shifts. We first calculated, for each species collected both in the dry and wet seasons (n = 62), the difference in the prevalence in the wet season samples minus the prevalence in the dry season samples, separately for traditional and metabarcoding samples. We then fitted ordinary least squares regressions between the seasonal difference in traditional samples and that in metabarcoding samples. We performed these regressions for all species, ants, springtails and termites, but not separately for common species, because of low sample size. Eventually, to evaluate the similarity between the traditional and metabarcoding samples regarding shifts in faunal composition between the dry and wet season, we considered separately for traditional and metabarcoding samples, a prevalence matrix of species (n = 62) x season (dry or wet). We then calculated the correlation between the two matrices using a Mantel test, computed with the function 'mantel' of the R package vegan (10,000 permutations 63 ).

Question 4: regressions between biomass and read frequency.
We considered only ants, for which we had reliable measurements of different variables accounting for body size, as well as an allometric correlation of body length to body weight. Analyses were restricted to ant species present in metabarcoding samples for which we had good measurements (1-5 individuals measured per species, n = 28 species). We tested linear and non-linear relationships with CurveExpert Professional 64 , including, for each species, as dependent variables the total number of reads, the relative read abundance 65 , the total number of sequences and prevalence in metabarcoding samples, and as independent variables either body length (mm), pronotum width (mm), Weber's length (mm) or body weight (g), measured from ant workers in collections of the ForestGEO Arthropod Initiative.
Question 5: false positive and negative records. Assuming no taxonomic errors, we considered all species occurring in metabarcoding samples but not in traditional samples as "possible false positive", those occurring in traditional samples but not in metabarcoding samples as "possible false negative" and the rest of species as "well surveyed". Furthermore, we considered species as "probable false positive" and "probable false negative" when their prevalence in samples was ≥ 10, in metabarcoding samples and traditional samples, respectively (i.e., when at least one specimen was collected potentially at each of the locations). We performed a discriminant analysis to test whether morphological traits such as body length, pronotum width and Weber's www.nature.com/scientificreports/ length (Question 4) delineated some of the categories defined above. This analysis concerned only Formicidae and the species in categories "probable false negative" and "well surveyed", as sample size was too low for species in other categories (morphological measurements lacked for most of false positives). Discriminant analysis was performed with the function 'discrim' of Systat version 13.1 (Systat Software, Inc., Chicago, IL). Given that the barcodes for our false negative samples exist, we used the ecoPCR 66 program to run an in silico PCR to test whether or not our primers could amplify these regions. Barcodes were downloaded from the NCBI database to create a reference dataset to match with our PCR primer pairs allowing up to 15 mismatches between primer and target sequences. To test for differences between taxa (Question 6), whenever possible we kept analyses separate among taxa when investigating Questions 1-5.

Results
Traditional  Table S5). Doubtlessly, total species richness in these samples may be higher as most of arthropod species from BCI have not been barcoded yet and, hence, could not be validated with our library databases. The validation of the remaining OTUs not assigned to BINs is challenging, as many may be artefactual 67 . It is beyond the scope of the present contribution to discuss this issue, although we identify complex of species in our focal taxa that may be worth investigating in the future (Supplementary Appendix S2 and Supplementary  Table S6). Given the low resolution of OTU data, we now focus on the detection of validated BINs detected in metabarcoding samples for our focal taxa, which included 49, 37 and 34 BINs for ants, springtails and termites, respectively. About 98%, 37% and 100% of species of ants, springtails and termites, respectively, observed in traditional samples had BINs (DNA extraction failed for some species; full list in Ref. 33 ), and could hence be directly compared with metabarcoding samples. In sum, our study system consisted of 101, 52 and 39 species with BINs (hereafter "species") of ants, springtails and termites, respectively, that were amenable to comparisons between traditional and metabarcoding samples ( Supplementary Fig. S1). Table 1 summarizes the number of individuals, species and reads per location.

Question 1: similarity between traditional and metabarcoding samples. Observed species rich-
ness (95 samples), extrapolated species richness (200 samples) and total estimated species richness was higher in traditional samples than in metabarcoding samples for ants ( Fig. 2 Fig. 2). These patterns were broadly similar when we restricted the data to common species, suggesting also that the number of common species discovered by twice the number of metabarcoding samples would not be much different than currently detected in ca. 100 samples (Supplementary Fig. S2; for the identity of common species, see Question 2). The heat trees visualizing differences in the number of BINs detected in traditional and metabarcoding samples indicated that termites were better surveyed with metabarcoding samples (Fig. 3). When considering all species surveyed, the Procrustes rotations performed with the NMDS including traditional and metabarcoding data for ants, springtails and termites were not significantly correlated (Supplementary Fig. S3; all correlations with p > 0.05; statistics detailed in the figure). These patterns were similar when we restricted the analyses to the 34 most common species, except for common termite species, whose two ordinations were significantly correlated ( Supplementary Fig. S3). In this case the information provided by the Procrustes rotation was rather low, as this concerned only four species. Note that after removal of the outlier location "WHE2" in Collembola ( Supplementary Fig. S3), the Procrustes rotations for both all and common species were still not significantly correlated (all species: n = 52 species considered, m12 = 0.785, r = 0.46, p = 0.34; common species: n = 10, m12 = 0.713, r = 0.54, p = 0.19). Question 2: correlation between abundance and prevalence. There were strong positive relationships between total abundance and prevalence in traditional samples when all species were analyzed together (67% of variance explained), as well as for ant (69%) and springtail species (72%) analyzed on their own, but not for termite species (4%; Table 2). The relationship between abundance in traditional samples and prevalence in metabarcoding samples was weaker. The strongest positive relationship existed for all ant species (24% of variance explained), followed by all springtail species (16%). This relationship was not significant for termite species. Restricting the data sets to common species did not notably improve the fit of relationships (Table 2). There were also significant but relatively weak correlations between the prevalence of species in traditional and metabarcoding samples. The highest proportion of variance explained was observed for termites (35%), springtails (29%) and ants (19%). Data sets restricted to common species did not improve the relationships, with the notable exception of Formicidae (40% of variance explained; Table 2). Figure 4 summarizes visually the prevalence of species in traditional and metabarcoding samples.  Fig. 4). This was expected a priori since the proportion of rare species is always high in tropical insect surveys 68 (32.6% of species collected/detected as singleton in either traditional or metabarcoding samples in this study). More revealing, "probable" false positive and negative species represented 14.0% and 20.0% of the number of species that occurred in 10 or more samples (n = 50), and 12.5% and 18.8% of the total number of common species, respectively. Probable false positives included three ant, two termite and two springtail species, whereas probable false negatives included six springtail and four ant species (Table 3). The COI barcodes of all of these false negative species proved to be amplifiable using our primers in our in silico evaluation (Supplementary Table S7).

Discussion
The development of molecular methods allows for greater opportunities to monitor the lesser-known biodiversity, including arthropods from tropical soils. Previous metabarcoding studies have discussed topics such as profiling communities, beta diversity, dispersal, community assembly and body size [18][19][20][21][22] . However, few studies have been considering practical questions related to monitoring over time and the elaboration of time series. www.nature.com/scientificreports/ The current project represents one of the first attempts to develop protocols for monitoring in the long term the soil fauna in the tropics. After considering 95 spatially paired samples processed either traditionally or with metabarcoding, representing a total of 101, 52 and 40 species of ants, springtails and termites, respectively, we provide a list of positive items for developing a DNA metabarcoding protocol for the monitoring of tropical soil arthropods. Termites were better surveyed by metabarcoding than by traditional samples and springtails equally well. Doubling the number of samples would have not resulted in the detection of many more species. There was a positive correlation between the abundance of ant and springtail species in traditional samples and their prevalence in metabarcoding, although it was relatively weak (12-27% of variance explained). Seasonal shifts in species prevalence between the dry and wet season were overall correlated between traditional and metabarcoding samples. Further, changes in faunal composition between dry and wet season were also correlated between traditional and metabarcoding samples. Probable false positive species represented 14.0% of the number of species that occurred in 10 or more samples. Close scrutiny of these results indicated that 3 species could be classified as true errors (see below), lowering the proportion of probable false positive species to 6.3%.
Conversely, the following items appear more challenging for developing a comprehensive monitoring protocol. The overall species richness of metabarcoding samples was underestimated in comparison to traditional samples. This effect was mostly due to ants, emphasizing the selectivity of metabarcoding for particular taxa. This pattern also prevailed when we restricted the analyses to common species. This may result from different factors, including selective PCR primer amplification bias 50 , better efficiency of nuclear DNA over mitochondrial DNA to delineate species in ants 69 , amplification bias in AT rich Hymenopteran genomes 70 or extensive heteroplasmy in the mitonchondrial DNA of certain species complex, such as Ectactomma ruidum 71 . The faunal composition of metabarcoding samples was significantly different from that in traditional samples, and this pattern was similar when we restricted the data to common species. There was no correlation between the abundance of termite species in traditional samples and their prevalence in metabarcoding samples. Although there was a weak correlation between body weight and relative read abundance, no relation existed between body weight and prevalence in metabarcoding samples. This suggests that species' body weight may not be useful for refining estimates of species prevalence in metabarcoding samples. Probable false negative species represented 20.0% of the number of species that occurred in 10 or more samples. They included four ant species with high abundance in traditional samples, and that are well collected on BCI with different protocols (see below). It is difficult to interpret these results, but for ants the discriminant analysis suggests that false negative species tended to be smaller than well surveyed ant species. Patterns were often different between the three focal taxa, which challenges the possibility to develop a unique protocol that may be sound for a variety of taxa. Appendix S2 details some methodological considerations emphasizing possible limitations in our study. www.nature.com/scientificreports/ Thus, the list of positive findings motivates us to continue tackling the remaining challenges to optimize a monitoring protocol. We should now return to our specific questions as targeting key criteria for an efficient, unbiased, and reliable monitoring protocol.
Differences between traditional and metabarcoding samples. Regarding our first question, metabarcoding appears to be rather selective at recording certain taxa and species. For example, the proportion of ant species which are overall common (first quartile of the distribution) in litter samples on BCI (data from the ForestGEO Arthropod Initiative, 2008-2019) and which were also recorded in our metabarcoding samples was 43%, whereas the corresponding figure for termites was 75%. From the viewpoint of the field researcher, metabarcoding represents yet another "sampling method" with inherent biases, but with the obvious benefit that time and expertise needed for handling and sorting samples is greatly reduced. Apart from the sensitivity  www.nature.com/scientificreports/ of PCR primers, the reasons for the selectivity observed at the species level are not well understood 72 . Development of better, more general primers (or more specific primers to target focal taxa) could improve current assessments 50,73 . This is not necessarily detrimental to long-term monitoring in soil organisms with metabarcoding, but the limitations of metabarcoding should be acknowledged. Regarding our second question, there was a correlation between prevalence in metabarcoding samples and abundance in traditional samples, but the proportion of explained variance was relatively low. A partial explanation may be, in the first place, the biases inherent in metabarcoding workflow, as pointed out above. Nonetheless, while similarity in faunal composition between traditional and metabarcoding samples was rather low, our approach to Question 3 showed that the methods is sensitive enough to reveal a good correspondence in the seasonal shifts of species in traditional and metabarcoding samples. This issue is relevant to long-term monitoring by indicating that metabarcoding is sufficiently sensitive to replicate ecological patterns through time in the soil fauna. This pattern has been demonstrated in multi-year analysis of other systems such as macroinvertebrates from wetlands 51 . Hence, related to Questions 2 and 3, we tentatively conclude that it may be possible to use prevalence in metabarcoding samples to derive an estimate of, for example, the annual rate of change in the abundance of species that are well recorded (common) in metabarcoding samples.

Biomass and read frequencies.
With respect to Question 4, we observed that the most common species of ants with high total number of reads are soil nesting species that are common on BCI. However, the relation between ant's body weight and read frequency or prevalence in metabarcoding samples was either weak or nonexistent. For social insects, colony size may be more relevant than worker size or biomass, but we did not have pertinent data to test this relationship. While good correlation between read frequencies and species biomass has been reported for nematodes 74 , this relationship is less clear-cut, weak or absent for arthropods 24,25,27,72,75,76 . Hence, with the current technology available to us, biomass data are unlikely to refine estimation of species prevalence (or biomass) among metabarcoding samples. Bista et al. 76 showed that shotgun mitogenomic sequencing provides better relationships between reads and biomass than metabarcoding, a finding later confirmed and developed by Ji et al. 77 . Hence, PCR-free methods may potentially offer an alternative avenue towards improved quantification.
False positives and negatives. The prevalence of false positive species is always a risk, because of contamination or errors during PCR and sequencing 40 . For long-term monitoring this is not an acute problem, as monitoring is often restricted to common species well amenable to statistical analysis 44 . "Probable false positives" as defined in our study can also sometimes be ruled out with a priori knowledge of the local fauna. In evidence of this, Ectatomma ACH3273 ruidum and Cylindrotermes macrognathus Snyder, 1929 were most likely missed in traditional samples, being part of a complex of cryptic species that can only be delineated with BINs (Ectatomma spp.) or difficult identification (C. macrognathus). The prevalence of Atlantitermes sp. 4 on BCI and of Collembola ADU7662 may be plausible, but the other three species, including the two Tetramorium which are invasive and urban pests from Asia 78,79 , are most likely errors (Table 3).
Explanations for the prevalence of "probable false negative species" are challenging. These species were reasonably well collected on BCI by the ForestGEO Arthropod Initiative during the period 2008-2019, and they all have BINs included in the reference libraries used. Thus, their absence in metabarcoding samples is unlikely to be due to local scarcity. The prevalence of probable false negative species is more problematic for long-term monitoring 23 . These errors may be generated by primer biases 39 , but this appears less likely in our study, because metabarcoding samples recorded congeneric species well. Further, the in silico evaluation revealed the primers to be capable of amplifying the COI barcodes of the probable false negative species. Another source of error may occur when genuine metabarcodes are assigned to an incorrect taxon due to incomplete or inappropriate reference databases 39 , or to the use of inefficient assignment methods 41 . In our study, false negatives may result mostly from inappropriate referencing of metabarcodes, particularly for species complexes. To reduce the number of false negatives, Ficetola et al. 40 advised to perform multiple PCR extractions and to use species occupancy models to refine the results. Unfortunately, the occupancy probability of false negative species cannot be estimated by such models, since these species were never encountered in our metabarcoding samples.

Conclusions
Targeting Question 6, we observed different patterns in the distribution of ants, springtails, and termites in traditional and metabarcoding samples. While Berlese-Tullgren is an efficient method for extracting soil springtails, termites are best surveyed with transects 80 , and ants with Winkler extraction 34 . Thus, the efficiency of metabarcoding may ultimately be improved by choosing a more selective sampling method appropriate for each target taxa.
Our study indicates that metabarcoding of samples extracted with Berlese-Tullgren may be well suited for the long-term monitoring of springtails and termites in tropical rainforests, provided that baseline data, including DNA reference libraries, exist 43 . Vouchered reference libraries are indispensable in this context, as they allow compilation of species traits and delineation of functional groups for a sound interpretation of time-series 32 . For ants, this issue is less clear-cut because many species lacked in metabarcoding samples. Technological improvements that may provide longer sequences and improved detection in reference to DNA libraries may improve ant detection. Irrespective of this issue, monitoring ants with metabarcoding may be possible for certain species with samples obtained from Berlese-Tullgren, and this may be complemented with metabarcoding samples obtained with other collecting methods such as Winkler, Malaise or light traps targeting alates 81 . Metabarcoding with Berlese-Tullgren may also provide additional data about non-target soil taxa not considered in this study. Sample processing can be simplified by obtaining bulk soil samples without extracting arthropods, although in this case